Lifetime Income and Housing Affordability in Singapore
The existing measures of housing affordability are essentially short-run indicators that compare current income with property prices. Given that a housing purchase is a long-horizon decision and that the property price reflects the discounted present value of future mortgage payments, we develop a housing affordability index as the ratio of lifetime income to housing price. Lifetime income is computed from the predicted income of a regression over the working life, from age 20 to 64, for each birth cohort for which limited data were available. Lifetime incomes of Singapore households at three income quantiles (lower quartile, median, and upper quartile) shed new light on increasing income inequality. The affordability index reveals informative trends and cycles in housing affordability in both the public and private sectors. We argue that residential property price escalations need to be avoided by showing that such price increases do not necessarily create a net wealth effect for the aggregate of households.
Keywords: lifetime income inequality, long-run housing affordability, wealth effect, price effect
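The index described in the abstract can be sketched numerically. The snippet below is a minimal illustration of the idea only: the cohort incomes, discount rate, and house price are hypothetical, not the paper's Singapore data, and a quadratic age-income profile stands in for whatever regression specification the authors used.

```python
import numpy as np

# Sketch of the affordability idea: lifetime income is the sum of discounted
# predicted annual incomes over ages 20-64, and the affordability index is
# lifetime income divided by housing price. All numbers below are made up.
ages = np.arange(20, 65)                         # working life, ages 20 to 64

obs_ages = np.array([25, 30, 35, 40, 45, 50])    # hypothetical cohort data
obs_income = np.array([30, 38, 45, 50, 52, 51])  # annual income, $1,000s
coef = np.polyfit(obs_ages, obs_income, 2)       # fitted age-income profile
predicted = np.polyval(coef, ages)               # predicted income at each age

r = 0.03                                         # assumed discount rate
discount = (1 + r) ** -(ages - ages[0])
lifetime_income = float(np.sum(predicted * discount))

house_price = 600.0                              # hypothetical price, $1,000s
affordability_index = lifetime_income / house_price
print(round(affordability_index, 2))
```

A rising index would then signal improving long-run affordability for that cohort, in contrast to short-run price-to-current-income ratios.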
The Cultural Revolution, Stress and Cancer
The link between mental stress and cancer is still a belief, not a well-established scientific fact. Scientists have relied largely on the opinions of cancer-stricken patients to establish a link between stress and cancer, and such opinion surveys tend to produce contradictory statistical inferences. Although it is difficult to conduct scientific experiments on humans similar to those on animals, human history is replete with "experiments" that have caused enormous stress on some human populations. The objective of this exercise is to draw evidence from one such massive experiment, the Cultural Revolution in China. Cancer data from Shanghai, analyzed through an age-period-cohort technique, show very strong evidence in support of the hypothesis that mental stress causes cancer.
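A well-known feature of the age-period-cohort technique used above is the exact identity cohort = period − age, which makes the three linear effects perfectly collinear; any APC analysis must impose some constraint to work around this. The snippet below (illustrative design matrix only, not the paper's Shanghai data) demonstrates the resulting rank deficiency.

```python
import numpy as np

# In an age-period-cohort design, cohort = period - age exactly, so the
# linear age, period, and cohort terms are perfectly collinear.
ages = np.arange(20, 70, 10)
periods = np.arange(1950, 2000, 10)
rows = [(a, p, p - a) for p in periods for a in ages]  # (age, period, cohort)
X = np.column_stack([np.ones(len(rows)), np.array(rows, dtype=float)])

# Four columns, but only three are linearly independent.
rank = int(np.linalg.matrix_rank(X))
print(rank)  # 3
```

This is why APC models are estimated with dummy-variable parameterizations plus an identifying restriction rather than with raw linear trends.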
Does the IV estimator establish causality? Re-examining Chinese fertility-growth relationship
The instrumental variable (IV) estimator in a cross-sectional or panel regression model is often taken to provide valid causal inference from contemporaneous correlations. In this exercise we point out that the IV estimator, like the OLS estimator, cannot be used effectively for causal inference without the aid of non-sample information. We present three possible cases (lack of identification, accounting identities, and temporal aggregation) in which IV estimates could lead to misleading causal inference. In other words, a non-zero IV estimate does not necessarily indicate a causal effect, nor does it reveal the direction of causality. In this light, we re-examine the relationship between Chinese provincial birth rates and economic growth. This exercise highlights the potential pitfalls of using too much temporal averaging to compile the data for cross-sectional and panel regressions, and the importance of estimating both regressions (x on y and y on x) to avoid misleading causal inferences. The GMM-SYS results from dynamic panel regressions based on five-year averages show a strong negative relationship running both ways, from births to growth and from growth to births. This outcome, however, changes to a more meaningful one-way relationship from births to growth if the panel analysis is carried out with the annual data. Although falling birth rates in China have enhanced the country's growth performance, it is difficult to attribute this effect solely to the one-child policy implemented after 1978.
Keywords: IV estimator and causal inference, identification, accounting identities, temporal aggregation, spurious causality, Chinese provincial growth and fertility relationship.
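The temporal-aggregation pitfall flagged in the abstract can be illustrated with a small simulation (hypothetical series, not the Chinese provincial panel): a strictly one-way causal process, once averaged into five-period blocks, yields sizeable regression slopes in both directions.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
x = np.zeros(T)
for t in range(1, T):                           # persistent driver: AR(1)
    x[t] = 0.9 * x[t - 1] + rng.normal()
y = np.zeros(T)
y[1:] = 0.8 * x[:-1] + rng.normal(size=T - 1)   # one-way causality: x -> y only

# aggregate to five-period (e.g. five-year) averages
xbar = x.reshape(-1, 5).mean(axis=1)
ybar = y.reshape(-1, 5).mean(axis=1)

slope_y_on_x = np.polyfit(xbar, ybar, 1)[0]
slope_x_on_y = np.polyfit(ybar, xbar, 1)[0]
print(slope_y_on_x, slope_x_on_y)
```

Both slopes come out well away from zero even though y never feeds back into x, which is the spurious two-way pattern the abstract attributes to five-year averaging.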
Testing for Homogeneity in Mixture Models
Statistical models of unobserved heterogeneity are typically formalized as mixtures of simple parametric models, and interest naturally focuses on testing for homogeneity versus general mixture alternatives. Many tests of this type can be interpreted as C(α) tests, as in Neyman (1959), and shown to be locally, asymptotically optimal. These tests are contrasted with a new approach to likelihood ratio testing for general mixture models. The latter tests are based on estimation of a general nonparametric mixing distribution with the Kiefer and Wolfowitz (1956) maximum likelihood estimator. Recent developments in convex optimization have dramatically improved upon earlier EM methods for computing these estimators, and recent results on the large-sample behavior of likelihood ratios involving such estimators yield a tractable form of asymptotic inference. The improvement in computational efficiency also facilitates the use of bootstrap methods to determine critical values, which are shown to work better than the asymptotic critical values in finite samples. Consistency of the bootstrap procedure is also formally established. We compare the performance of the two approaches, identifying circumstances in which each is preferred.
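The Kiefer-Wolfowitz NPMLE can be approximated by fixing a grid of support points and maximizing over the mixing weights. The sketch below uses the older EM iteration mentioned in the abstract as a simple stand-in for the modern convex-optimization solvers, on simulated data from a Gaussian location mixture.

```python
import numpy as np

rng = np.random.default_rng(1)
# simulated data: two-component Gaussian location mixture, unit variance
y = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

grid = np.linspace(-2.0, 5.0, 50)           # fixed support for the mixing dist.
w = np.full(grid.size, 1.0 / grid.size)     # uniform starting weights

# n x m matrix of N(grid_j, 1) likelihoods, one row per observation
phi = np.exp(-0.5 * (y[:, None] - grid[None, :]) ** 2) / np.sqrt(2 * np.pi)

loglik = []
for _ in range(200):                        # EM updates for the mixing weights
    dens = phi @ w                          # mixture density at each y_i
    loglik.append(float(np.log(dens).sum()))
    post = phi * w / dens[:, None]          # posterior over support points
    w = post.mean(axis=0)                   # M-step: average responsibilities

print(round(w.sum(), 6))
```

Each EM pass keeps the weights on the simplex and never decreases the log-likelihood; the slow convergence of exactly this iteration is what the interior-point methods alluded to in the abstract improve upon.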
Learning Large-Scale Bayesian Networks with the sparsebn Package
Learning graphical models from data is an important problem with wide applications, ranging from genomics to the social sciences. Nowadays datasets often have upwards of thousands, sometimes tens or hundreds of thousands, of variables and far fewer samples. To meet this challenge, we have developed a new R package called sparsebn for learning the structure of large, sparse graphical models, with a focus on Bayesian networks. While there are many existing software packages for this task, this package focuses on the unique setting of learning large networks from high-dimensional data, possibly with interventions. As such, the methods provided place a premium on scalability and consistency in a high-dimensional setting. Furthermore, in the presence of interventions, the methods implemented here achieve the goal of learning a causal network from data. Additionally, the sparsebn package is fully compatible with existing software packages for network analysis.
Comment: To appear in the Journal of Statistical Software; 39 pages, 7 figures.
Penalized Estimation of Directed Acyclic Graphs From Discrete Data
Bayesian networks, with structure given by a directed acyclic graph (DAG), are a popular class of graphical models. However, learning Bayesian networks from discrete or categorical data is particularly challenging, due to the large parameter space and the difficulty of searching for a sparse structure. In this article, we develop a maximum penalized likelihood method to tackle this problem. Instead of the commonly used multinomial distribution, we model the conditional distribution of a node given its parents by multi-logit regression, in which an edge is parameterized by a set of coefficient vectors with dummy variables encoding the levels of a node. To obtain a sparse DAG, a group norm penalty is employed, and a blockwise coordinate descent algorithm is developed to maximize the penalized likelihood subject to the acyclicity constraint of a DAG. When interventional data are available, our method constructs a causal network, in which a directed edge represents a causal relation. We apply our method to various simulated and real data sets. The results show that our method is very competitive, compared to many existing methods, in DAG estimation from both interventional and high-dimensional observational data.
Comment: To appear in Statistics and Computing.